Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition
Authors
Abstract
Automatic speech recognition (ASR) is one of the most demanding tasks in natural language processing owing to its complexity. Recently, deep learning approaches have been deployed for this task and have proven to outperform traditional machine learning approaches such as the artificial neural network (ANN). In particular, the deep learning method of long short-term memory (LSTM) has achieved improved ASR performance. However, this method is limited in handling continuous input streams. The traditional LSTM requires four (4) linear layers (multilayer perceptron (MLP) layers) per cell, with a large memory bandwidth at each sequence time step. It therefore cannot accommodate the many computational units required for such streams, because the system does not have sufficient memory bandwidth to feed those units. In this study, an enhanced recurrent neural network (RNN) model was proposed to resolve this shortcoming. In this model, the RNN incorporates a "forget gate" block that allows states to be reset at the beginning of sub-sequences. This enables the network to process continuous streams efficiently without necessarily increasing memory bandwidth. The standard architecture was also modified to make more effective use of its parameters. Several CNN-based sequential models were evaluated on the same dataset and compared with the proposed model. The LSTM-RNN outperformed the others, achieving an accuracy of 99.36% on a well-established public benchmark dataset of spoken English digits.
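The two mechanisms the abstract highlights, the four linear (MLP) layers evaluated per LSTM cell at every time step and the forget-gate block that resets state at sub-sequence boundaries, can be made concrete with a short sketch. This is a minimal NumPy illustration of one reading of the abstract, not the authors' implementation; the class, parameter names, and boundary set below are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell. The four weight/bias pairs below are the
    'four linear layers per cell' the abstract refers to."""

    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        k = input_size + hidden_size
        # One linear layer per gate: input i, forget f, candidate g, output o.
        self.W = {g: rng.normal(0.0, 0.1, (hidden_size, k)) for g in "ifgo"}
        self.b = {g: np.zeros(hidden_size) for g in "ifgo"}

    def step(self, x, h, c, reset=False):
        # Resetting (h, c) at a sub-sequence boundary stands in for the
        # "forget gate" block the abstract describes (an interpretation,
        # not the paper's exact mechanism).
        if reset:
            h, c = np.zeros_like(h), np.zeros_like(c)
        z = np.concatenate([x, h])
        i = sigmoid(self.W["i"] @ z + self.b["i"])  # input gate
        f = sigmoid(self.W["f"] @ z + self.b["f"])  # forget gate
        g = np.tanh(self.W["g"] @ z + self.b["g"])  # candidate cell state
        o = sigmoid(self.W["o"] @ z + self.b["o"])  # output gate
        c = f * c + i * g       # additive cell-state update
        h = o * np.tanh(c)      # hidden output
        return h, c

# Hypothetical usage on a continuous feature stream (e.g. 13 MFCCs per frame):
cell = LSTMCell(input_size=13, hidden_size=32)
h = c = np.zeros(32)
stream = np.random.default_rng(1).normal(size=(100, 13))
boundaries = {0, 40, 70}  # assumed sub-sequence start frames
for t, frame in enumerate(stream):
    h, c = cell.step(frame, h, c, reset=(t in boundaries))
```

Resetting state at sub-sequence boundaries, rather than carrying it across the whole stream, bounds how much history each step must integrate, which is how the proposed model can process continuous streams without scaling memory bandwidth with stream length.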
Similar Resources
Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that has been designed to address the vanishing and exploding gradient problems of conventional RNNs. Unlike feedforward neural networks, RNNs have cyclic connections making them powerful for modeling sequences. They have been successfully used for sequence labeling and sequence prediction tasks, such as handwriting ...
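For reference, the standard LSTM cell equations (a textbook formulation, stated here for context rather than quoted from the paper above) show why the architecture resists vanishing gradients:

```latex
i_t = \sigma(W_i [x_t; h_{t-1}] + b_i)           % input gate
f_t = \sigma(W_f [x_t; h_{t-1}] + b_f)           % forget gate
o_t = \sigma(W_o [x_t; h_{t-1}] + b_o)           % output gate
\tilde{c}_t = \tanh(W_c [x_t; h_{t-1}] + b_c)    % candidate state
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t  % additive cell update
h_t = o_t \odot \tanh(c_t)                       % hidden output
```

Because c_t depends on c_{t-1} through an elementwise product with f_t rather than a repeated weight-matrix multiplication followed by a squashing nonlinearity, the gradient along the cell-state path can stay close to 1 when f_t is near 1, which is what lets LSTMs bridge long time lags where conventional RNNs fail.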
Long short-term memory based convolutional recurrent neural networks for large vocabulary speech recognition
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-art performance on many speech recognition tasks, as they can learn a dynamically changing contextual window over the entire sequence history. On the other hand, convolutional neural networks (CNNs) have brought significant improvements to deep feed-forward neural networks (FFNNs),...
Robust speech recognition using long short-term memory recurrent neural networks for hybrid acoustic modelling
One method to achieve robust speech recognition in adverse conditions including noise and reverberation is to employ acoustic modelling techniques involving neural networks. Using long short-term memory (LSTM) recurrent neural networks proved to be efficient for this task in a setup for phoneme prediction in a multi-stream GMM-HMM framework. These networks exploit a self-learnt amount of tempor...
From Recurrent Neural Network to Long Short Term Memory Architecture
Despite more than 30 years of handwriting recognition research, recognizing unconstrained sequences is still a challenging task. The difficulty of segmenting cursive script has led to low recognition rates. Hidden Markov Models (HMMs) are considered state-of-the-art methods for performing unconstrained handwriting recognition. However, HMMs have several well-known drawbacks. One of thes...
Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks
Long Short-Term Memory (LSTM) recurrent neural network has proven effective in modeling speech and has achieved outstanding performance in both speech enhancement (SE) and automatic speech recognition (ASR). To further improve the performance of noise-robust speech recognition, a combination of speech enhancement and recognition was shown to be promising in earlier work. This paper aims to expl...
Journal
Journal title: IEEE Access
Year: 2022
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2022.3159339